Diagnosing and Augmenting Feature Representations in Correctional Inverse Reinforcement Learning
Robots have become increasingly adept at performing tasks for humans by
learning from their feedback, but they still often suffer from model misalignment due to
missing or incorrectly learned features. When the features the robot needs to
learn to perform its task are missing or do not generalize well to new
settings, the robot will not be able to learn the task the human wants and,
even worse, may learn a completely different and undesired behavior. Prior work
shows how the robot can detect when its representation is missing some feature
and can, thus, ask the human to be taught about the new feature; however, these
works do not differentiate between features that are completely missing and
those that exist but do not generalize to new environments. In the latter case,
the robot would detect misalignment and simply learn a new feature, leading to
an arbitrarily growing feature representation that can, in turn, lead to
spurious correlations and incorrect learning down the line. In this work, we
propose separating the two sources of misalignment: a framework for
determining whether a feature the robot needs has been incorrectly learned
and fails to generalize to new environment setups, or is entirely missing from the
robot's representation. Once we detect the source of error, we show how the
human can initiate the realignment process for the model: if the feature is
missing, we follow prior work for learning new features; however, if the
feature exists but does not generalize, we use data augmentation to expand its
training and, thus, complete the correction. We demonstrate the proposed
approach in experiments with a simulated 7-DoF robot manipulator and physical
human corrections. Comment: 8 pages, 4 figures
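The detection step can be sketched in a hedged way: under the toy assumption that candidate features enter the objective linearly, a human correction that lies in the span of the robot's existing feature gradients points to a generalization gap in a known feature, while one that does not suggests a missing feature. The function names and the linear model below are illustrative, not the paper's actual implementation.

```python
# Hedged sketch with hypothetical names; not the paper's implementation.
import numpy as np

def explain_correction(feature_matrix, correction, tol=0.1):
    """Check whether a human correction can be explained as a
    reweighting of the robot's existing features.

    feature_matrix: (n_features, d) rows are feature gradients at the
                    corrected state (assumed interface).
    correction:     (d,) observed human correction direction.
    Returns 'generalization gap' if the existing features explain the
    correction (so augment their training data), else 'missing feature'.
    """
    # Least-squares projection of the correction onto the feature span.
    weights, *_ = np.linalg.lstsq(feature_matrix.T, correction, rcond=None)
    residual = np.linalg.norm(feature_matrix.T @ weights - correction)
    relative = residual / (np.linalg.norm(correction) + 1e-9)
    return "generalization gap" if relative < tol else "missing feature"
```

A correction inside the feature span triggers data augmentation of the offending feature; one outside it triggers the feature-learning procedure from prior work.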
Teaching Robots to Span the Space of Functional Expressive Motion
Our goal is to enable robots to perform functional tasks in emotive ways, be
it in response to their users' emotional states, or expressive of their
confidence levels. Prior work has proposed learning an independent cost function
from user feedback for each target emotion, so that the robot may optimize it
alongside task- and environment-specific objectives for any situation it
encounters. However, this approach is inefficient when modeling multiple
emotions and unable to generalize to new ones. In this work, we leverage the
fact that emotions are not independent of each other: they are related through
a latent space of Valence-Arousal-Dominance (VAD). Our key idea is to learn a
model for how trajectories map onto VAD with user labels. Considering the
distance between a trajectory's mapping and a target VAD allows this single
model to represent cost functions for all emotions. As a result, 1) all user
feedback can contribute to learning about every emotion; 2) the robot can
generate trajectories for any emotion in the space instead of only a few
predefined ones; and 3) the robot can respond emotively to user-generated
natural language by mapping it to a target VAD. We introduce a method that
interactively learns to map trajectories to this latent space and test it in
simulation and in a user study. In experiments, we use a simple vacuum robot as
well as the Cassie biped.
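The single-model idea can be illustrated with a toy sketch: one mapping from trajectories into VAD space, with every emotion's cost given by distance to that emotion's target VAD point. The mapping, the emotion coordinates, and all names below are stand-ins, not the learned model from the paper.

```python
# Illustrative sketch; 'traj_to_vad' stands in for the learned mapping,
# and the VAD coordinates are made-up example values.
import numpy as np

# Valence-Arousal-Dominance targets (example values, not from the paper).
EMOTION_VAD = {
    "happy": np.array([0.8, 0.5, 0.4]),
    "sad":   np.array([-0.6, -0.4, -0.3]),
    "angry": np.array([-0.5, 0.7, 0.6]),
}

def traj_to_vad(trajectory):
    # Placeholder for the learned trajectory -> VAD model; here, just
    # the mean of the trajectory's (toy) 3-D features.
    return np.mean(trajectory, axis=0)

def emotive_cost(trajectory, emotion):
    """One cost usable for any emotion: the distance between the
    trajectory's VAD embedding and the target emotion's VAD point."""
    return float(np.linalg.norm(traj_to_vad(trajectory) - EMOTION_VAD[emotion]))
```

Because the cost is defined over the continuous VAD space, a target can also come from natural language mapped to VAD rather than from a predefined emotion label.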
Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation
Policies often fail due to distribution shift -- changes in the state and
reward that occur when a policy is deployed in new environments. Data
augmentation can increase robustness by making the model invariant to
task-irrelevant changes in the agent's observation. However, designers don't
know which concepts are irrelevant a priori, especially when different end
users have different preferences about how the task is performed. We propose an
interactive framework to leverage feedback directly from the user to identify
personalized task-irrelevant concepts. Our key idea is to generate
counterfactual demonstrations that allow users to quickly identify possible
task-relevant and irrelevant concepts. The knowledge of task-irrelevant
concepts is then used to perform data augmentation and thus obtain a policy
adapted to personalized user objectives. We present experiments validating our
framework on discrete and continuous control tasks with real human users. Our
method (1) enables users to better understand agent failure, (2) reduces the
number of demonstrations required for fine-tuning, and (3) aligns the agent to
individual user task preferences. Comment: International Conference on Machine Learning (ICML) 2023
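The augmentation step can be sketched as follows, under the illustrative assumptions that observations are flat vectors and that the user's feedback has flagged particular dimensions as task-irrelevant; all names are hypothetical.

```python
# Sketch of data augmentation over user-identified task-irrelevant
# concepts (e.g., a background-color index in the observation). The
# observation layout and names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def augment(observations, actions, irrelevant_dims, n_copies=4):
    """Replicate each (obs, action) pair with the task-irrelevant
    dimensions randomized while keeping the action label fixed, so a
    policy fine-tuned on the result becomes invariant to those dims."""
    obs_aug, act_aug = [observations], [actions]
    for _ in range(n_copies):
        noisy = observations.copy()
        noisy[:, irrelevant_dims] = rng.uniform(
            -1, 1, size=(len(observations), len(irrelevant_dims)))
        obs_aug.append(noisy)
        act_aug.append(actions)
    return np.concatenate(obs_aug), np.concatenate(act_aug)
```

Because only the flagged dimensions are resampled, each user's notion of "irrelevant" yields a personalized augmented dataset for fine-tuning.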
Learning Perceptual Concepts by Bootstrapping from Human Queries
When robots operate in human environments, it is critical that humans can
quickly teach them new concepts: object-centric properties of the environment
that they care about (e.g., an object being near or upright). However, teaching a new
perceptual concept from high-dimensional robot sensor data (e.g. point clouds)
is demanding, requiring an unrealistic amount of human labels. To address this,
we propose a framework called Perceptual Concept Bootstrapping (PCB). First, we
leverage the inherently lower-dimensional privileged information, e.g., object
poses and bounding boxes, available from a simulator only at training time to
rapidly learn a low-dimensional, geometric concept from minimal human input.
Second, we treat this low-dimensional concept as an automatic labeler to
synthesize a large-scale high-dimensional data set with the simulator. With
these two key ideas, PCB alleviates human label burden while still learning
perceptual concepts that work with real sensor input where no privileged
information is available. We evaluate PCB for learning spatial concepts that
describe object state or multi-object relationships, and show it achieves
superior performance compared to baseline methods. We also demonstrate the
utility of the learned concepts in motion planning tasks on a 7-DoF Franka
Panda robot. Comment: 9 pages, 10 figures
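A minimal sketch of the two-step idea follows, with the caveat that the low-dimensional concept here is hand-coded rather than learned from a handful of human queries, and every name is illustrative.

```python
# Toy sketch of the PCB pipeline. The low-dim concept here is a
# hand-specified rule over privileged bounding-box extents ("upright"
# when taller than wide); a real system would learn it from a few
# human labels over privileged simulator state.
import numpy as np

def low_dim_concept(bbox_extents):
    """Privileged-info concept: 'upright' if taller than wide."""
    x, y, z = bbox_extents
    return z > max(x, y)

def auto_label(point_cloud_dataset, privileged_extents):
    """Step 2: use the cheap low-dim concept as an automatic labeler to
    build a large (point cloud, label) set for training a high-dim
    perceptual model that needs no privileged info at test time."""
    return [(pc, low_dim_concept(ext))
            for pc, ext in zip(point_cloud_dataset, privileged_extents)]
```

The human labels only the cheap low-dimensional concept; the simulator then mass-produces labeled high-dimensional sensor data for free.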
Getting aligned on representational alignment
Biological and artificial information processing systems form representations
that they can use to categorize, reason, plan, navigate, and make decisions.
How can we measure the extent to which the representations formed by these
diverse systems agree? Do similarities in representations then translate into
similar behavior? How can a system's representations be modified to better
match those of another system? These questions pertaining to the study of
representational alignment are at the heart of some of the most active research
areas in cognitive science, neuroscience, and machine learning. For example,
cognitive scientists measure the representational alignment of multiple
individuals to identify shared cognitive priors, neuroscientists align fMRI
responses from multiple individuals into a shared representational space for
group-level analyses, and ML researchers distill knowledge from teacher models
into student models by increasing their alignment. Unfortunately, there is
limited knowledge transfer between research communities interested in
representational alignment, so progress in one field often ends up being
rediscovered independently in another. To improve communication between these
fields, we
propose a unifying framework that can serve as a common language between
researchers studying representational alignment. We survey the literature from
all three fields and demonstrate how prior work fits into this framework.
Finally, we lay out open problems in representational alignment where progress
can benefit all three of these fields. We hope that our work can catalyze
cross-disciplinary collaboration and accelerate progress for all communities
studying and developing information processing systems. We note that this is a
working paper and encourage readers to reach out with their suggestions for
future revisions. Comment: Working paper; changes to be made in an upcoming revision
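As one concrete way to quantify how much two systems' representations agree over the same stimuli, linear Centered Kernel Alignment (CKA) is a standard metric from the ML literature; it is shown here purely as an illustration, not as the measure the framework prescribes.

```python
# Linear CKA between two representations of the same stimuli.
import numpy as np

def linear_cka(X, Y):
    """X: (n_stimuli, d1), Y: (n_stimuli, d2) responses of two systems
    to the same n stimuli. Returns a similarity score in [0, 1] that is
    invariant to rotation and isotropic scaling of either space."""
    X = X - X.mean(axis=0)          # center each feature
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)
```

Its invariances matter for cross-system comparison: two representations that differ only by a rotation or a rescaling score as perfectly aligned.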
Aligning Robot Representations with Humans
Robots are becoming increasingly woven into the fabric of our society, from self-driving cars on our streets to assistive manipulators in our homes. To act in the world, robots rely on a representation of salient features of the task: for example, to hand me a cup of coffee, the robot considers movement efficiency and cup orientation in its behavior. However, if we want robots to act for and with people, their representations must not be just functional but also reflective of what humans care about, i.e. their representations must be aligned with humans'. What holds us back from successful human-robot interaction is that these representations are often misaligned, resulting in anything from miscoordination and misunderstandings to learning and executing dangerous behaviors.

To learn the human's representation of what matters in a task, typical methods rely on data sets of human behavior, but this data cannot reflect every individual, environment, and task the robot will be exposed to. This dissertation advocates that we should instead treat humans as active participants in the interaction, not as static data sources: robots must engage with humans in an interactive process for finding a shared representation. We formalize the representation alignment problem as a joint search for a common representation. Then, rather than hoping that representations will naturally be aligned, we propose having humans directly teach them to robots with representation-specific input. Next, we enable robots to automatically detect representation misalignment with the human by estimating a confidence over how much the robot's representation can explain the human's behavior. We demonstrate how human-aligned representations can lead to novel human behavior models with broad implications beyond robotics, to econometrics and cognitive science.

Finally, this thesis concludes by asking ``How can robots help the human-robot team converge to a shared representation?'' and discusses opportunities for future work in expanding representation alignment for seamless human-robot interaction.